Directed Partial Correlation: Inferring Large-Scale Gene Regulatory Network through Induced Topology Disruptions
Abstract
Inferring regulatory relationships among many genes from temporal variation in their transcript abundance has been a popular research topic. Because microarray experiments typically measure far more variables than samples, classical tools for time series analysis lose power. In this paper, we describe some of the existing multivariate inference techniques that are applicable to hundreds of variables and show the potential challenges they face with small-sample, large-scale data. We propose a directed partial correlation (DPC) method as an efficient and effective solution to regulatory network inference from such data. The method is designed specifically for large-scale genomic datasets. It combines the efficiency of partial correlation, which sets up the network topology by testing conditional independence, with the concept of Granger causality, which assesses topology change under induced interruptions. The idea is that when a transcription factor is induced artificially within a gene network, the disruption of the network caused by the induction signifies that gene's role in transcriptional regulation. Benchmarking results using GeneNetWeaver, the simulator for the DREAM challenges, provide strong evidence of the outstanding performance of the proposed DPC method. When applied to real biological data, the inferred starch metabolism network in Arabidopsis reveals many biologically meaningful network modules worthy of further investigation. Together, these results suggest that DPC is a versatile tool for genomics research. The R package DPC is available for download (http://code.google.com/p/dpcnet/).
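To illustrate the conditional-independence idea that the abstract attributes to partial correlation, here is a minimal sketch in Python. It is not the DPC method or the DPC R package; the function names `pearson` and `partial_corr` and the toy chain X -> Z -> Y are assumptions made only for this example. It uses the standard first-order formula r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)).

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given z."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical toy data (not from the paper): a chain X -> Z -> Y,
# so X influences Y only through Z.
rng = random.Random(0)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]
z = [xi + rng.gauss(0, 1) for xi in x]
y = [zi + rng.gauss(0, 1) for zi in z]

r_xy = pearson(x, y)                  # marginal correlation, clearly non-zero
r_xy_given_z = partial_corr(x, y, z)  # close to zero once Z is conditioned on
print(round(r_xy, 2), round(r_xy_given_z, 2))
```

In this toy chain the marginal correlation between X and Y is substantial, while the partial correlation given Z is near zero, which is the sense in which partial correlation separates direct from indirect associations. Partial correlation alone yields an undirected graph; per the abstract, edge direction in DPC comes from the Granger-causality step, which this sketch does not attempt.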
Similar resources
An empirical Bayes approach to inferring large-scale gene association networks
MOTIVATION Genetic networks are often described statistically using graphical models (e.g. Bayesian networks). However, inferring the network structure offers a serious challenge in microarray analysis where the sample size is small compared to the number of considered genes. This renders many standard algorithms for graphical models inapplicable, and inferring genetic networks an 'ill-posed' i...
Discovery of meaningful associations in genomic data using partial correlation coefficients
MOTIVATION A major challenge of systems biology is to infer biochemical interactions from large-scale observations, such as transcriptomics, proteomics and metabolomics. We propose to use a partial correlation analysis to construct approximate Undirected Dependency Graphs from such large-scale biochemical data. This approach enables a distinction between direct and indirect interactions of bioc...
Small-Sample Analysis and Inference of Networked Dependency Structures from Complex Genomic Data
plications in Genetics and Molecular Biology 4: Article 32. Juliane Schäfer and Korbinian Strimmer. 2005. An empirical Bayes approach to inferring large-scale gene association networks. Bioinformatics 21:754–764. Juliane Schäfer and Korbinian Strimmer. 2005. Learning large-scale graphical Gaussian models from genomic data. Summary The present work is concerned with modeling and inferring geneti...
Optimal topology of gene-regulatory networks: role of the average shortest path
Gene regulatory networks (GRNs) possess an important structural property; they are sparse and resilient, with a robust topology that affords protection against random “attacks” (e.g., gene deletions). However, such networks exhibit optimal or near-optimal topological features not present in other scale-free networks. This paper utilizes an integer linear program formulation to gauge the exact s...
On the Difficulty of Inferring Gene Regulatory Networks: A Study of the Fitness Landscape Generated by Relative Squared Error
Inferring gene regulatory networks from expression profiles is a challenging problem that has been tackled using many different approaches. When posed as an optimization problem, the typical goal is to minimize the value of an error measure, such as the relative squared error, between the real profiles and those generated with a model whose parameters are to be optimized. In this paper, we use ...
Journal title:
Volume 6, issue
Pages -
Publication date: 2011